Bayesian Model Averaging for Biomarker Discovery from Genome-Wide Microarray Data

Author

  • Ka Yee Yeung
Abstract

Gene expression microarray data have recently become a popular basis for classification in a variety of diagnostic areas. Here, classification means predicting the diagnostic category of a tissue sample from its expression array phenotype, given similar data from tissues in known categories. A challenge in predicting diagnostic categories from microarray data is that the number of genes is usually much greater than the number of tissue samples available. Furthermore, only a subset of the genes is relevant in distinguishing different classes. The selection of relevant genes for classification is known as variable selection or feature selection. In cancer research and detection, a biomarker is a substance or process that indicates the presence of cancer in the body. High-throughput technologies (including microarrays) mainly contribute to the “discovery” phase of biomarker development, in which exploratory studies are used to identify potential targets. The objective of the “discovery” phase is to determine a shortlist of high-priority candidates. The number of such candidate targets is limited by the capacity of downstream target validation, which is time-consuming, costly, and labor-intensive. Therefore, a small set of potential target genes from microarray studies is highly desirable for the development of inexpensive diagnostic tests. Furthermore, the merits of using a combination of biomarkers are well documented in the literature. However, most biomarker combinations in previous studies were not determined systematically. In this chapter, Bayesian Model Averaging (BMA) is presented for gene selection and classification of microarray data. This method is a multivariate technique that considers multiple variables (biomarkers) simultaneously. Typical gene selection and classification procedures ignore model uncertainty and use a single set of relevant genes (a model) to predict class.
Bayesian Model Averaging addresses uncertainty about which set is best by averaging over multiple models (sets of potentially overlapping relevant genes). We show that BMA typically selects small numbers of predictive genes while achieving relatively high prediction accuracy. We believe that our BMA framework is a powerful technique for the practical application of biomarker diagnostics.
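
The averaging step described above can be illustrated with a minimal sketch. It is not the chapter's implementation: it assumes equal prior probabilities for all candidate models and approximates each model's posterior probability from its BIC value, which is one common choice that the abstract itself does not specify. The gene names, BIC values, and per-model predictions are hypothetical numbers chosen only for illustration.

```python
import math


def posterior_model_weights(bic_values):
    """Approximate P(model | data) from BIC values.

    Assumption: all models have equal prior probability, and
    exp(-BIC / 2) is used as a stand-in for the marginal likelihood.
    """
    best = min(bic_values)
    # Subtract the smallest BIC before exponentiating for numerical stability.
    unnormalized = [math.exp(-0.5 * (b - best)) for b in bic_values]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def bma_predict(model_probs, weights):
    """Average the per-model class probabilities, weighted by model posterior."""
    return sum(w * p for w, p in zip(weights, model_probs))


def gene_inclusion_probs(models, weights):
    """Posterior probability that each gene is relevant:
    the summed weight of every model that contains the gene."""
    probs = {}
    for genes, w in zip(models, weights):
        for gene in genes:
            probs[gene] = probs.get(gene, 0.0) + w
    return probs


# Hypothetical candidate models: overlapping sets of genes.
models = [("geneA", "geneB"), ("geneA", "geneC"), ("geneB",)]
# Hypothetical BIC of each fitted model (lower is better).
bic_values = [100.0, 102.0, 104.0]
# Hypothetical P(class = 1 | sample) predicted by each model for one new sample.
model_probs = [0.9, 0.8, 0.4]

weights = posterior_model_weights(bic_values)
averaged = bma_predict(model_probs, weights)
inclusion = gene_inclusion_probs(models, weights)

print("model weights:", [round(w, 3) for w in weights])
print("averaged P(class = 1):", round(averaged, 3))
print("gene inclusion probabilities:", {g: round(p, 3) for g, p in inclusion.items()})
```

In this sketch the final prediction comes from all candidate gene sets rather than a single best one, and the per-gene inclusion probabilities give one possible way to rank genes for a shortlist of candidate biomarkers.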

Similar resources

Selective Model Averaging with Bayesian Rule Learning for Predictive Biomedicine

Accurate disease classification and biomarker discovery remain challenging tasks in biomedicine. In this paper, we develop and test a practical approach to combining evidence from multiple models when making predictions using selective Bayesian model averaging of probabilistic rules. This method is implemented within a Bayesian Rule Learning system and compared to model selection when applied t...

Predicting waste generation using Bayesian model averaging

A prognosis model has been developed for solid waste generation from households in Hoi An City, a famous tourist city in Viet Nam. Waste sampling, followed by a questionnaire survey, was carried out to gather data. The Bayesian model average method was used to identify factors significantly associated with waste generation. Multivariate linear regression analysis was then applied to evaluate th...

Predicting CpG Islands and DNA Methylation in the Cow Genome Using DNA Microarray Meta-Analysis and Genome Wide Scanning

DNA methylation is a type of epigenetic change that directly affects DNA. In mammals, DNA methylation is essential for fetal development and stem cell differentiation, and it occurs mainly within CpG islands. In this study, two methods were used to study the DNA methylation profile of the cow genome. In the first method, the DNA methylation profile of the differentially expres...

A new 12-gene diagnostic biomarker signature of melanoma revealed by integrated microarray analysis

Genome-wide microarray technology has facilitated the systematic discovery of diagnostic biomarkers of cancers and other pathologies. However, meta-analyses of published arrays often uncover significant inconsistencies that hinder advances in clinical practice. Here we present an integrated microarray analysis framework, based on a genome-wide relative significance (GWRS) and genome-wide global...

Estimating gene regulatory networks and protein-protein interactions of Saccharomyces cerevisiae from multiple genome-wide data

MOTIVATION: Biological processes in cells are properly performed by gene regulations, signal transductions and interactions between proteins. To understand such molecular networks, we propose a statistical method to estimate gene regulatory networks and protein-protein interaction networks simultaneously from DNA microarray data, protein-protein interaction data and other genome-wide data. RES...

Publication date: 2010